gguf : use Qn_K for k-quants instead of KQn #837
Merged
#822 (by @mofosyne) has introduced a naming convention for GGUF model files, but the way it names k-quants doesn't follow the established practice: all other places where k-quants are named use `Qn_K`, where `n` is the number of bits per weight excluding the scales.

`rg -i 'KQ\d'` doesn't return anything related to quants except for this recently-added section, while `rg -i 'Q\d_K'` returns a lot of things related to k-quants when run in the `ggml` and `llama.cpp` repos.

So this renames `KQ2` to `Q2_K`, for consistency. This should avoid unnecessary confusion.

(Note that the recently-added wiki page about "tensor encoding schemes" will need to be updated too, since it is the only other place I found that also uses this `KQ<X>` naming scheme.)
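
For anyone who already has files or docs using the old spelling, the rename described above can be sketched as a small text rewrite. This is only an illustration, not part of this PR or of the gguf tooling: the helper name `rename_kquant` and the example file name are hypothetical, and it assumes the old labels appear as standalone `KQ<n>` tokens.

```python
import re

# Hypothetical helper (not part of ggml, llama.cpp, or the gguf tooling).
# Matches a standalone KQ<n> token, i.e. one not glued to other letters/digits.
_KQ_LABEL = re.compile(r"(?<![A-Za-z0-9])KQ(\d+)(?![A-Za-z0-9])")


def rename_kquant(text: str) -> str:
    """Rewrite every KQ<n> label in `text` to the established Q<n>_K form."""
    return _KQ_LABEL.sub(r"Q\g<1>_K", text)


if __name__ == "__main__":
    # Made-up file name, used only to show the rewrite.
    print(rename_kquant("example-7B-KQ2.gguf"))  # prints "example-7B-Q2_K.gguf"
```

Labels that already follow the `Qn_K` form, and non-k-quant labels such as `Q4_0`, are left untouched.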